276 research outputs found
Automating the construction of scene classifiers for content-based video retrieval
This paper introduces a real-time automatic scene classifier for content-based video retrieval. In our envisioned approach, end users such as documentalists, not image processing experts, build classifiers interactively by simply indicating positive examples of a scene. Classification is a two-stage procedure. First, small image fragments called patches are classified. Second, frequency vectors of these patch classifications are fed into a second classifier for global scene classification (e.g., city, portraits, or countryside). The first-stage classifiers can be seen as a set of highly specialized, learned feature detectors, as an alternative to letting an image processing expert determine features a priori. We present results for experiments on a variety of patch and image classes. The scene classifier has been used successfully within television archives and for Internet porn filtering.
Multi-Level Visual Alphabets
A central debate in visual perception theory is the argument for indirect versus direct perception; i.e., the use of intermediate, abstract, and hierarchical representations versus direct semantic interpretation of images through interaction with the outside world. We present a content-based representation that combines both approaches. The previously developed Visual Alphabet method is extended with a hierarchy of representations, each level feeding into the next one, but based on features that are not abstract but directly relevant to the task at hand. Explorative benchmark experiments are carried out on face images to investigate and explain the impact of key parameters such as pattern size, number of prototypes, and the distance measures used. Results show that adding a middle layer improves results by encoding the spatial co-occurrence of lower-level pattern prototypes.
Real time automatic scene classification
This work has been done as part of the EU VICAR (IST) project and the EU SCOFI project (IAP). The aim of the first project was to develop a real-time video indexing, classification, annotation, and retrieval system. For our systems, we have adapted the approach of Picard and Minka [3], who categorized elements of a scene automatically with so-called ‘stuff’ categories (e.g., grass, sky, sand, stone). Campbell et al. [1] use similar concepts to describe certain parts of an image, which they named “labeled image regions”. However, they did not use these elements to classify the topic of the scene. Subsequently, we developed a generic approach for the recognition of visual scenes, where an alphabet of basic visual elements (or “typed patches”) is used to classify the topic of a scene. We define a new image element: a patch, which is a group of adjacent pixels within an image, described by a specific local pixel distribution, brightness, and color. In contrast with pixels, a patch as a whole can incorporate semantics. A patch is described by an HSI color histogram with 16 bins and by three texture features (i.e., the variance and two values based on the two eigenvalues of the covariance matrix of the intensity values of a mask run over the image). For more details on the features used we refer to Israel et al. [2]. We aimed at describing each image as a vector with a fixed size and with information about the position of patches that is not strict (strict position would limit generalization). Therefore, a fixed grid is placed over the image and each grid cell is segmented into patches, which are then categorized by a patch classifier. For each grid cell a frequency vector of its classified patches is calculated. These vectors are concatenated. The resulting vector describes the complete image. Several grids were applied and several patch sizes within the grid cells were tested. A grid size of 3x2 combined with patches of size 16x16 provided the best system performance.
For the two classification phases of our system, back-propagation networks were trained: (i) classification of the patches and (ii) classification of the image vector as a whole. The system was tested on the classification of eight categories of scenes from the Corel database: interiors, city/street, forest, agriculture/countryside, desert, sea, portrait, and crowds. Each of these categories was relevant for the VICAR project. Based upon their relevance for these eight categories of scenes, we chose nine categories for the classification of the patches: building, crowd, grass, road, sand, skin, sky, tree, and water. This approach was found to be successful (87.5% correct for classification of the patches and 73.8% correct for classification of the scenes). An advantage of our method is its low computational complexity. Moreover, the classified patches themselves are intermediate image representations and can be used for image classification, image segmentation, and image matching. A disadvantage is that the patches with which the classifiers were trained had to be classified manually. To overcome this drawback, we are currently developing algorithms for automatic extraction of relevant patch types. Within the IST project VICAR, a video indexing system was built for the Netherlands Institute for Sound and Vision, consisting of four independent modules: car recognition, face recognition, movement recognition (of people), and scene recognition. The latter module was based upon the aforementioned approach. Within the IAP project SCOFI, a real-time Internet pornography filter was built, based upon this approach. The system is currently running at several schools in Europe. Within the SCOFI filtering system, our image classification system (with a performance of 92% correct) works together with a text classification system that includes a proxy server (FilterX, developed by Demokritos, Greece) to classify web pages. Its total performance is 0% overblocking and 1% underblocking.
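The image description in the abstract above (a fixed grid, a frequency vector of classified patches per grid cell, and concatenation of the cell vectors) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names `split_bounds` and `image_vector` are hypothetical, the patch labels are assumed to come from an already trained patch classifier, reading "3x2" as 3 columns by 2 rows is an assumption, and so is normalizing counts to relative frequencies.

```python
from collections import Counter

# The nine patch categories named in the abstract.
PATCH_CLASSES = ["building", "crowd", "grass", "road", "sand",
                 "skin", "sky", "tree", "water"]


def split_bounds(length, parts):
    """Return (start, stop) index pairs dividing `length` into `parts` near-equal runs."""
    return [(i * length // parts, (i + 1) * length // parts) for i in range(parts)]


def image_vector(patch_labels, grid_cols=3, grid_rows=2, classes=PATCH_CLASSES):
    """Build a fixed-size image vector from a 2-D list of patch class names.

    patch_labels[r][c] is the class assigned to the patch in row r, column c
    (assumed to be the output of a separate patch classifier).
    """
    n_rows, n_cols = len(patch_labels), len(patch_labels[0])
    vector = []
    for r0, r1 in split_bounds(n_rows, grid_rows):
        for c0, c1 in split_bounds(n_cols, grid_cols):
            # Collect the labels of all patches that fall inside this grid cell.
            cell = [patch_labels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            counts = Counter(cell)
            total = len(cell) or 1  # guard against empty cells
            # Relative frequency of each patch class within the cell (assumption).
            vector.extend(counts[name] / total for name in classes)
    return vector


# Toy example: a 4x6 layout of patches, sky above grass, with one tree patch.
labels = [["sky"] * 6 for _ in range(2)] + [["grass"] * 6 for _ in range(2)]
labels[0][0] = "tree"
vec = image_vector(labels)
```

With nine patch classes and six grid cells the resulting vector always has 54 entries regardless of image size, which is what allows it to be fed to a second, scene-level classifier. Position information is kept only at the level of the grid cell, in line with the abstract's remark that strict positions would limit generalization.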
Using photorespiratory oxygen response to analyse leaf mesophyll resistance
Classical approaches to estimate mesophyll conductance ignore differences in resistance components for CO2 from intercellular air spaces (IAS) and CO2 from photorespiration (F) and respiration (Rd). Consequently, mesophyll conductance apparently becomes sensitive to (photo)respiration relative to net photosynthesis, (F + Rd)/A. This sensitivity depends on several hard-to-measure anatomical properties of mesophyll cells. We developed a method to estimate the parameter m (0 ≤ m ≤ 1) that lumps these anatomical properties, using gas exchange and chlorophyll fluorescence measurements where (F + Rd)/A ratios vary. This method was applied to tomato and rice leaves measured at five O2 levels. The estimated m was 0.3 for tomato but 0.0 for rice, suggesting that classical approaches implying m = 0 work well for rice. The mesophyll conductance taking the m factor into account still responded to irradiance, CO2, and O2 levels, similar to response patterns of stomatal conductance to these variables. Largely due to different m values, the fraction of (photo)respired CO2 being refixed within mesophyll cells was lower in tomato than in rice. But that was compensated for by the higher fraction via IAS, making the total re-fixation similar for both species. These results, agreeing with CO2 compensation point estimates, support our method of effectively analysing mesophyll resistance.
Ukrainian Culture as a Factor of Ukrainian Statehood: The Historical Aspect
The article presents the particular features of the formation of Ukrainian national culture from the late 19th through the 20th century. It characterizes the development of culture during the establishment of Ukrainian statehood. The author seeks to trace the influence of national society on the development of culture in a specific historical period.
Theory of Mind in Large Language Models: Examining Performance of 11 State-of-the-Art models vs. Children Aged 7-10 on Advanced Tests
To what degree should we ascribe cognitive capacities to Large Language
Models (LLMs), such as the ability to reason about intentions and beliefs known
as Theory of Mind (ToM)? Here we add to this emerging debate by (i) testing 11
base- and instruction-tuned LLMs on capabilities relevant to ToM beyond the
dominant false-belief paradigm, including non-literal language usage and
recursive intentionality; (ii) using newly rewritten versions of standardized
tests to gauge LLMs' robustness; (iii) prompting and scoring for open besides
closed questions; and (iv) benchmarking LLM performance against that of
children aged 7-10 on the same tasks. We find that instruction-tuned LLMs from
the GPT family outperform other models, and often also children. Base-LLMs are
mostly unable to solve ToM tasks, even with specialized prompting. We suggest
that the interlinked evolution and development of language and ToM may help
explain what instruction-tuning adds: rewarding cooperative communication that
takes into account interlocutor and context. We conclude by arguing for a
nuanced perspective on ToM in LLMs. Comment: 14 pages, 4 figures, forthcoming in Proceedings of the 27th
Conference on Computational Natural Language Learning (CoNLL)
Dementia in People with Severe/Profound Intellectual (and Multiple) Disabilities: Applicability of Items in Dementia Screening Instruments for People with Intellectual Disabilities
Introduction: Diagnosing dementia in people with severe/profound intellectual (and multiple) disabilities (SPI(M)D) is complex. Whereas existing dementia screening instruments as a whole are unsuitable for this population, a number of individual items may apply. Therefore, this study aimed to identify applicable items in existing dementia screening instruments. Methods: Informant interviews about 40 people with SPI(M)D were conducted to identify applicable items in the Dementia Scale for Down Syndrome, the Behavioral and Psychological Symptoms of Dementia in Down Syndrome II scale, the Dementia Questionnaire for persons with Mental Retardation, and the Social competence Rating scale for people with Intellectual Disabilities. Results: Among 193 items, 101 were found applicable, categorized in 5 domains: behavioral and psychological functioning (60 items), cognitive functioning (25), motor functioning (6), activities of daily living (5), and medical comorbidities (5). Conclusion: Identifying applicable items for people with SPI(M)D is an essential step in developing a dedicated dementia screening instrument for this population.